
37 research outputs found

Energy saving models for wireless sensor networks

Author: APILETTI D
BARALIS E
CERQUITELLI T.
Publication venue: 'Springer Science and Business Media LLC'

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Evaluating espresso coffee quality by means of time-series feature engineering

Author: Apiletti D.
Baralis E.
Callà R.
Pastor E.
Publication venue: CEUR-WS
Publication date: 01/01/2020

Espresso quality attracts the interest of many stakeholders: from consumers to local business activities, from coffee-machine vendors to international coffee industries. So far, it has been mostly addressed by means of human experts, electronic noses, and chemical approaches. The current work, instead, proposes a data-driven analysis exploiting time-series feature engineering. We analyze a real-world dataset of espresso brewing by professional coffee-making machines. The novelty of the proposed work lies in its focus on the brewing time series, from which we propose to engineer features able to improve previous data-driven metrics determining the quality of the espresso. Thanks to the proposed features, better quality-evaluation predictions are achieved with respect to previous data-driven approaches that relied solely on metrics describing each brewing as a whole (e.g., average flow, total amount of water). Yet, the engineered features are simple to compute and add a very limited workload to the coffee-machine sensor-data collection device, hence being suitable for large-scale IoT installations on board professional coffee machines, such as those typically installed in consumer-oriented business activities, shops, and workplaces. To the best of the authors' knowledge, this is the first attempt to perform a data-driven analysis of real-world espresso-brewing time series. The presented results yield three-fold improvements in classification accuracy of high-quality espresso coffees with respect to current data-driven approaches (from 30% to 100%), exploiting simple threshold-based quality evaluations defined in the newly proposed feature space.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
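
The contrast drawn in the abstract above, between whole-brew metrics (average flow, total water) and features engineered from the brewing time series, can be illustrated with a minimal Python sketch. The feature names `time_to_peak` and `early_late_ratio`, the thresholds, and the sample trace are all hypothetical and chosen only for illustration; they are not the features or thresholds used in the paper.

```python
from statistics import mean


def brewing_features(flow, dt=1.0):
    """Compute whole-brew metrics and simple time-series features.

    flow: list of flow samples (e.g. ml/s), one every dt seconds.
    The time-series features here are illustrative assumptions.
    """
    if not flow:
        raise ValueError("flow trace is empty")
    avg_flow = mean(flow)
    total_water = sum(flow) * dt
    # Index of the first sample that reaches the maximum flow.
    peak_index = max(range(len(flow)), key=flow.__getitem__)
    half = len(flow) // 2
    early = mean(flow[:half]) if half else avg_flow
    late = mean(flow[half:])
    return {
        "avg_flow": avg_flow,          # whole-brew metric
        "total_water": total_water,    # whole-brew metric
        "time_to_peak": peak_index * dt,                              # hypothetical feature
        "early_late_ratio": early / late if late else float("inf"),  # hypothetical feature
    }


def is_high_quality(features, max_time_to_peak=5.0, max_ratio=1.5):
    """Threshold-based evaluation in the feature space (thresholds are made up)."""
    return (features["time_to_peak"] <= max_time_to_peak
            and features["early_late_ratio"] <= max_ratio)


trace = [0.0, 1.0, 2.0, 2.0, 2.0, 1.0]  # made-up flow samples
feats = brewing_features(trace)
print(feats, is_high_quality(feats))
```

Two brews with the same average flow and total water can differ in features like these, which is the kind of information a whole-brew summary cannot capture.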

In-Network Outlier Detection in Wireless Sensor Networks

Author: A Beck
A Cerpa
Boleslaw Szymanski
Chris Giannella
D Apiletti
D Krivitski
G Tietjen
H Fan
Hillol Kargupta
IF Akyildiz
IF Akyildiz
Joel W. Branch
K Bhaduri
K Das
K Holger
L Chen
M Bawa
M Mehyar
M Otey
P Gupta
R Wolff
R Wolff
Ran Wolff
S Basu
S Chong
S Mukherjee
V Barnett
V Hodge
W Mebane
X Sheng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/09/2009

To address the problem of unsupervised outlier detection in wireless sensor networks, we develop an approach that (1) is flexible with respect to the outlier definition, (2) computes the result in-network to reduce both bandwidth and energy usage, (3) only uses single-hop communication, thus permitting very simple node failure detection and message reliability assurance mechanisms (e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data. We examine performance using simulation with real sensor data streams. Our results demonstrate that our approach is accurate and imposes a reasonable communication load and level of power consumption. Comment: Extended version of a paper appearing in the Int'l Conference on Distributed Computing Systems 200

arXiv.org e-Print Archive

Crossref

Challenges in managing real-time data in health information system (HIS)

Author: A Thusoo
D Apiletti
J Dean
K Kaur
K Rabbi
L-C Huang
M Hussain
N Peek
P Gorp Van
R Cattell
W Raghupathi
W-S Jian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

In this paper, we discuss the challenges in handling real-time medical big data collection and storage in health information systems (HIS). Based on these challenges, we propose a model for real-time analysis of medical big data. We exemplify the approach through Spark Streaming and Apache Kafka, using the processing of a health big data stream. Apache Kafka works very well in transporting data among different systems such as relational databases, Apache Hadoop, and non-relational databases. However, Apache Kafka lacks the ability to analyze the stream, whereas the Spark Streaming framework can perform some operations on the stream. We have identified the challenges in current real-time systems and proposed our solution to cope with medical big data streams.

ZU Scholars (Zayed University)

Crossref

A feature selection method for classification within functional genomics experiments based on the proportional overlapping score

Author: A Kikuchi
A Statnikov
A Ultsch
Andrew Harrison
Aris Perperoglou
Asma Gul
B Lausen
Berthold Lausen
C Cortes
C Ding
C Ma
C Müssel
C Zou
D Apiletti
D Apiletti
DA Notterman
DeAndresSA Díaz‐Uriarte R
DG Altman
E Baralis
GJ Gordon
H Peng
H‐C Liu
J Fan
J Fan
J Lu
K‐H Chen
L Breiman
L Breiman
L Lausser
M Dramiński
M Marczyk
Metodi V Metodiev
N De Jay
Osama Mahmoud
P Alhopuro
P Laiho
RN Jorissen
RS Croner
RS Croner
S Chiaretti
S Michiels
T Cover
T Jirapech‐Umpai
TR Golub
VG Tusher
W Talloen
Y Saeys
Y Su
Zardad Khan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: Microarray technology, as well as other functional genomics experiments, allows simultaneous measurements of thousands of genes within each sample. Both the prediction accuracy and interpretability of a classifier could be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on overlapping analysis of expression data across classes. This method results in a novel measure, called proportional overlapping score (POS), of a feature's relevance to a classification task. Results: We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. The experimental results of classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves a better performance. Conclusions: A novel gene selection method, POS, is proposed. POS analyzes the expression overlap across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene that allows it to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are exploited to produce the selected subset of genes.

University of Essex Research Repository

Crossref

Springer - Publisher Connector

PubMed Central

Explore Bristol Research

Forecasting: theory and practice

Author: Apiletti D
Assimakopoulos V
Babai MZ
Barrow DK
Ben Taieb S
Bergmeir C
Bessa RJ
Bijak J
Boylan JE
Browell J
Carnevale C
Castle JL
Cirillo P
Clements MP
Cordeiro C
Cyrino Oliveira FL
De Baets S
Dokumentov A
Ellison J
Fiszeder P
Franses PH
Frazier DT
Gilliland M
Goodwin P
Grossi L
Grushka-Cockayne Y
Guidolin M
Guidolin M
Gunter U
Guo X
Guseo R
Gönül MS
Harvey N
Hendry DF
Hollyman R
Januschowski T
Jeon J
Jose VRR
Kang Y
Koehler AB
Kolassa S
Kourentzes N
Leva S
Li F
Litsiou K
Makridakis S
Martin GM
Martinez AB
Meeran S
Modis T
Nikolopoulos K
Paccagnini A
Panagiotelis A
Panapakidis I
Pavía JM
Pedio M
Pedregal DJ
Petropoulos F
Pinson P
Ramos P
Rapach DE
Reade JJ
Rostami-Tabar B
Rubaszek M
Sermpinis G
Shang HL
Spiliotis E
Syntetos AA
Talagala PD
Talagala TS
Tashman L
Thomakos D
Thorarinsdottir T
Todini E
Trapero Arenas JR
Wang X
Winkler RL
Yusupova A
Ziel F
Önkal D
Publication venue: 'Elsevier BV'
Publication date: 20/01/2022

Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases

UCL Discovery

Business analytics in industry 4.0: a systematic review

Author: Abdelmaguid T.
Abdirad M.
Abdous M.‐A.
Akhtari S.
Alasali F.
Albers A.
Alberto Sala D.
Ali S.
Ansari F.
Antomarioni S.
Apiletti D.
Armbrust M.
Arnott D.
Aydemir G.
Bagheri M.
Bakar N.
Banerjee A.
Barton D.
Birglen L.
Bordel B.
Bordeleau F.‐E.
Borgi T.
Bose S. K.
Bousdekis A.
Brik B.
Bruneo D.
Bányai T.
Calabrese M.
Candanedo I.
Canizo M.
Cao G.
Cao Q.
Charest M.
Chen H.
Chen Y.
Chen Y.‐J.
Chiang L.
Chi‐Hsien K.
Choi W.
Chong D.
Cicconi P.
Cisotto S.
Clegg D.
Costa C.
Costa R.
Davenport T. H.
Diez‐Olivan A.
Duan L.
Durakbasa N.
Dutta R.
Dwaraka R.
ESRTC
Essien A.
European Commission
Fu Y.
Gomes M.
Goodfellow I.
Guo Z.
Haffner O.
He M.
Hesser D. F.
Jugulum R.
Kabugo J. C.
Karakose M.
Kaupp L.
Kharwar P.
Khatri V.
Khayyam H.
Kiangala K.
Kim S.
Kirchen I.
Kitchenham B.
Klement N.
Koch R.
Kohlert M.
Krishnamoorthi S.
Krishnan K.
Kumar A.
Kuo C.‐J.
Kuo H.
Lasi H.
Lee J.
Lee J.
Lee W. J.
Lee Y.‐M.
Leite M.
Lenz J.
Li H.
Li S. C.
Li Y.
Li Z.
Liang Y.
Lin C.
Lin C.
Lin T.
Liulys K.
Lu Y.
Ma C.
Maggipinto M.
Martinek P.
Masoudinejad M.
Massaro A.
Massaro A.
Milošević M.
Miškuf M.
Mohanty S.
Mozgova I.
Muhuri P.
Mulrennan K.
Naskos A.
Negri E.
Neuböck T.
Nikolic B.
Niño M.
Nuzzi C.
O'Donovan P.
Packianather M. S.
Pane Y.
Park C. Y.
Peralta G.
Pierezan J.
Pinto R.
Plehiers P.
Ploennigs J.
Pradhan K.
Proto S.
Qi Q.
Qin J.
Qin J.
Qu S.
Rahman H.
Rendall R.
Richter J.
Rogier J.
Romeo L.
Rosli N.
Rousopoulou V.
Ruiz‐Sarmiento J.
Russom P.
Russom P.
Saldivar A. A. F.
Saldivar A. A. F.
Saldivar A. A. F.
Sanz E.
Saxena V. K.
Sellami C.
Senkerik R.
Sezer E.
Sharp M.
Shrouf F.
Silva D.
Soto J. A. C.
Spendla L.
Stein B. V.
Straus P.
Stürmlinger T.
Subakti H.
Subramaniyan M.
Sun I.
Susto G. A.
Swamy A. K.
Sá A.
Tan Y.
Tang D.
Teschemacher U.
Tieng H.
Tiwari K.
Tjahjono B.
Trunzer E.
Tsai S.
Tsourma M.
Uhlmann E.
Uriarte A. G.
Vathoopan M.
Vazan P.
Ventura F.
Wan J.
Wang Y.
Wang Y.‐M.
Wu W.
Xia F.
Xu L. D.
Xu X.
Xu X.
Yan H.
Yan J.
Yang H.
Yang J.
Yeh W.
Zenisek J.
Zhang Q.
Zhang T.
Zheng M.
Zhong R. Y.
Zhou H.
Öchsner A.
Publication venue: 'Wiley'
Publication date: 01/01/2021

Recently, the term “Industry 4.0” has emerged to characterize several Information Technology and Communication (ICT) adoptions in production processes (e.g., Internet-of-Things, implementation of digital production support information technologies). Business Analytics is often used within Industry 4.0, thus incorporating its data intelligence (e.g., statistical analysis, predictive modelling, optimization) expert system component. In this paper, we perform a Systematic Literature Review (SLR) on the usage of Business Analytics within the Industry 4.0 concept, covering a selection of 169 papers obtained from six major scientific publication sources from 2010 to March 2020. The selected papers were first classified into three major types, namely, Practical Application, Reviews and Framework Proposal. Then, we analysed in more detail the practical application studies, which were further divided into the three main categories of the Gartner analytical maturity model: Descriptive Analytics, Predictive Analytics and Prescriptive Analytics. In particular, we characterized the distinct analytics studies in terms of the industry application and data context used, impact (in terms of their Technology Readiness Level) and selected data modelling method. Our SLR analysis provides a mapping of how data-based Industry 4.0 expert systems are currently used, also disclosing research gaps and future research opportunities. The work of P. Cortez was supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. We would like to thank the three anonymous reviewers for their helpful suggestions.

Universidade do Minho: RepositoriUM

Crossref

A Survey of Bayesian Statistical Approaches for Big Data

Author: A Akusok
A Baldominos
A Belle
A Beskos
A Bouchard-Côté
A De Mauro
A Fahad
A Gandomi
A Lee
A Lee
A Marshall
A O’Driscoll
A Siddiqa
A Vyas
AB Owen
AF Wise
AR Linero
AT Azar
AT Porter
AT Porter
AÇ Pehlivanlı
B Franke
B Liquet
B Liu
B Oancea
C Loebbecke
C Wang
C Wang
C Yang
CA McGrory
CC Drovandi
CE Rasmussen
Changwon Yoo
CK Emani
D Apiletti
D Oprea
D Talia
DB Dunson
DM Blei
DN Politis
DT Frazier
DV Shah
DW Bates
E Raguseo
ED Schifano
ET Bradlow
F Lindsten
Florian Buettner
Florian Maire
G Bello-Orgaz
G Jifa
GI Allen
GJ Lasinio
GM Allenby
H Cai
H Demirkan
H Hassani
H Kousar
HA Chipman
HH Huang
HJ Watson
I Ben-Gal
J Fan
J Roski
J Zhu
Jake Luo
JE Bibault
JJ Chen
JN Cappella
JS Rumsfeld
K Chalupka
Kath Albury
KL Mengersen
KS Divya
L Breiman
L Liu
L Mählmann
L Wang
L Yu
L Zhang
L Zhou
LG Nongxa
M Hilbert
M Viceconti
MA Suchard
Matias Quiroz
MD Assunção
MD Hoffman
MT Moores
N Moustafa
N. Chopin
NA Lazar
O Sysoev
Oliver Müller
OY Al-Jarrah
P Ducange
P Müller
P Pudlo
PF Brennan
R Bardenet
R Burrows
R Guhaniyogi
R Guhaniyogi
R Guhaniyogi
R Izbicki
RF Mansour
Richard Branch
Robin Genuer
RW Hoerl
S Atkinson
S Castruccio
S Chaudhuri
S Fosso Wamba
S Guha
S Kaisler
S Li
S Minsker
S Pandey
S Sagiroglu
S Sisson
S Srivastava
S Suthaharan
S White
SF Wamba
Shahriar Akter
Shweta Bansal
Simon I. Hay
SL Scott
SM Schennach
Sudipto Banerjee
T Magdon-Ismail
T Zhang
Tengyao Wang
TH McCormick
TJ McKinley
U Sivarajah
VD Katkar
X Zhang
XF Wang
XG Xia
Xing Ju Lee
Y Tang
Y Webb-Vargas
Y Zhang
Yang Ni
YW Teh
Z Ma
Z Sun
Z Zhang
Ziad Obermeyer
Zoubin Ghahramani
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/05/2020

The modern era is characterised as an era of information or Big Data. This has motivated a huge literature on new methods for extracting information and insights from these data. A natural question is how these approaches differ from those that were available prior to the advent of Big Data. We present a review of published studies that present Bayesian statistical approaches specifically for Big Data and discuss the reported and perceived benefits of these approaches. We conclude by addressing the question of whether focusing only on improving computational algorithms and infrastructure will be enough to face the challenges of Big Data

arXiv.org e-Print Archive

Crossref

Queensland University of Technology ePrints Archive

Exploiting scalable machine-learning distributed frameworks to forecast power consumption of buildings

Author: Apiletti D.
Cerquitelli T.
Malnati G.
Publication venue: 'MDPI AG'

The pervasive and increasing deployment of smart meters allows collecting a huge amount of fine-grained energy data in different urban scenarios. The analysis of such data is challenging and opens up a variety of interesting and new research issues across energy and computer science research areas. The key role of computer scientists is providing energy researchers and practitioners with cutting-edge and scalable analytics engines to effectively support their daily research activities, hence fostering and leveraging data-driven approaches. This paper presents SPEC, a scalable and distributed engine to predict building-specific power consumption. SPEC addresses the full analytic stack and exploits a data stream approach over sliding time windows to train a prediction model tailored to each building. The model allows us to predict the upcoming power consumption at a time instant in the near future. SPEC integrates different machine learning approaches, specifically ridge regression, artificial neural networks, and random forest regression, to predict fine-grained values of power consumption, and a classification model, the random forest classifier, to forecast a coarse consumption level. SPEC exploits state-of-the-art distributed computing frameworks to address the big data challenges in harvesting energy data: the current implementation runs on Apache Spark, the most widespread high-performance data-processing platform, and can natively scale to huge datasets. As a case study, SPEC has been tested on real data from a heating distribution network and power consumption data collected in a major Italian city. Experimental results demonstrate the effectiveness of SPEC in forecasting both fine-grained values and coarse levels of power consumption of buildings.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
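
The idea in the SPEC abstract above, training a per-building model over sliding time windows to predict the next consumption value, can be sketched in plain Python with ridge regression. This is a single-machine toy under stated assumptions, not the SPEC implementation (which the abstract says runs on Apache Spark): the helper names `make_windows`, `fit_ridge`, and `predict_next`, the window width, the regularization strength, and the sample series are all hypothetical, and the model has no intercept term.

```python
def make_windows(series, width):
    """Slide a window over the series: each window of `width` past
    values is paired with the value that immediately follows it."""
    if len(series) <= width:
        raise ValueError("series must be longer than the window width")
    n = len(series) - width
    X = [series[i:i + width] for i in range(n)]
    y = [series[i + width] for i in range(n)]
    return X, y


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression without intercept:
    w = (X^T X + alpha * I)^(-1) X^T y, with alpha > 0."""
    d = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(XtX, Xty)


def predict_next(series, width=3, alpha=1e-6):
    """Fit on all sliding windows, then predict the value after the last one."""
    X, y = make_windows(series, width)
    w = fit_ridge(X, y, alpha)
    return sum(wi * xi for wi, xi in zip(w, series[-width:]))


readings = [float(v) for v in range(1, 11)]  # made-up, steadily rising load
print(predict_next(readings, width=2, alpha=1e-9))
```

In a streaming setting the same fit would be repeated as the window advances, so that each building's model tracks its most recent behaviour.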